Generate Aggregation (Advanced) (Operator Toolbox)
Synopsis
This operator allows you to aggregate a selection of attributes per example.
Description
This operator works very similarly to the Generate Aggregation operator. Both operators let you aggregate per example (i.e., row-wise). You select the aggregation function with the 'function' parameter. Each function may have one or more parameters. The name of the generated attribute is always the name of the function you selected.
The difference from Generate Aggregation is that this operator offers more complex aggregation functions.
The difference from Aggregate is that this operator aggregates row-wise, while Aggregate performs aggregations on a columnar basis.
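The row-wise versus columnar distinction can be illustrated with a small sketch in plain Python. This is not RapidMiner code: the table, the attribute values, and the use of max as the aggregation function are assumptions chosen only for illustration.

```python
# Illustrative only: a tiny table stored as a list of rows (examples).
# The attribute names and values are made up for this sketch.
rows = [
    {"att1": 3, "att2": 8, "att3": 5},
    {"att1": 9, "att2": 1, "att3": 4},
]
attributes = ["att1", "att2", "att3"]

# Row-wise aggregation: one result per example.
row_wise_max = [max(row[a] for a in attributes) for row in rows]

# Columnar aggregation: one result per attribute.
column_wise_max = {a: max(row[a] for row in rows) for a in attributes}

print(row_wise_max)     # [8, 9]
print(column_wise_max)  # {'att1': 9, 'att2': 8, 'att3': 5}
```

The row-wise result has as many entries as there are examples, which is why it can be attached to the data as a new attribute.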
Input
- exa (Data table)
The ExampleSet you want to perform the aggregation on.
Output
- exa (Data table)
The ExampleSet with the generated aggregation.
- ori (Data table)
The original ExampleSet.
Parameters
- function
This parameter allows you to set the aggregation function. Note that the name of the generated attribute is always the function name.
- kth min: Calculates the kth minimum of a set of values. Example: if the input data is 30, 50, 10, 2 and you select k=2, you get the 2nd smallest value, in this case 10.
- kth max: Calculates the kth maximum of a set of values. Example: if the input data is 30, 50, 10, 2 and you select k=2, you get the 2nd largest value, in this case 30.
- k
Configuration for the 'kth min' and 'kth max' aggregation functions. Defines which minimum or maximum to extract.
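The two functions and the role of k can be sketched in a few lines of Python. This is a minimal illustration, not the operator's implementation. The helper names kth_min and kth_max are hypothetical, and the handling of duplicates (counted separately) and of an out-of-range k (an error) are assumptions, since the description above does not specify them.

```python
def kth_min(values, k):
    """Return the kth smallest value; k=1 gives the minimum."""
    if not 1 <= k <= len(values):
        # Assumption: an out-of-range k is treated as an error here.
        raise ValueError("k must be between 1 and the number of values")
    return sorted(values)[k - 1]


def kth_max(values, k):
    """Return the kth largest value; k=1 gives the maximum."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    return sorted(values, reverse=True)[k - 1]


data = [30, 50, 10, 2]
print(kth_min(data, 2))  # 10
print(kth_max(data, 2))  # 30
```

With k=1 the two helpers reduce to the ordinary minimum and maximum.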
Tutorial Processes
Generate 2nd Minimum of a Random Data Set
In this example we generate random data with the attribute names att1, att2, att3, att4 and att5. We use the Generate Aggregation (Advanced) operator to calculate the 2nd minimum (2nd smallest value) of the attributes att2, att3, att4 and att5 in each row. Note that att1 is not selected.
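For readers who want to check the result outside RapidMiner, the tutorial can be approximated in plain Python. This is a rough sketch under stated assumptions: the number of rows, the value range, and the random seed are arbitrary, and the new attribute is called "kth min" only because the generated attribute is said to carry the function name (the exact spelling the operator uses is an assumption).

```python
import heapq
import random

random.seed(42)  # arbitrary seed, only to make the sketch repeatable

names = ["att1", "att2", "att3", "att4", "att5"]
selected = ["att2", "att3", "att4", "att5"]  # att1 is deliberately left out

# Five rows of random values; the row count is an assumption.
examples = [{name: random.random() for name in names} for _ in range(5)]

for example in examples:
    # nsmallest(2, ...) returns the two smallest values in ascending order,
    # so the last element is the 2nd minimum of the selected attributes.
    two_smallest = heapq.nsmallest(2, (example[a] for a in selected))
    example["kth min"] = two_smallest[-1]

for example in examples:
    print(example)
```

Because att1 is not in the selection, its value never influences the new attribute, even when it is the smallest value in the row.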